Randomized sampling for large zero-sum games
Authors
Abstract
Similar resources
Sampling Techniques for Markov Games: Approximation Results on Sampling Techniques for Zero-sum, Discounted Markov Games
We extend the “policy rollout” sampling technique for Markov decision processes to Markov games, and provide an approximation result guaranteeing that the resulting sampling-based policy is closer to the Nash equilibrium than the underlying base policy. This improvement is achieved with an amount of sampling that is independent of the state-space size. We base our approximation result on a more...
Sampling Techniques for Zero-sum, Discounted Markov Games
In this paper, we first present a key approximation result for zero-sum, discounted Markov games, providing bounds on the state-wise loss and the loss in the sup norm resulting from using approximate Q-functions. Then we extend the policy rollout technique for MDPs to Markov games. Using our key approximation result, we prove that, under certain conditions, the rollout technique gives rise to a...
A TRANSITION FROM TWO-PERSON ZERO-SUM GAMES TO COOPERATIVE GAMES WITH FUZZY PAYOFFS
In this paper, we deal with games with fuzzy payoffs. We prove that players who are playing a zero-sum game with fuzzy payoffs against Nature are able to increase their joint payoff, and hence their individual payoffs, by cooperating. It is shown that a cooperative game with a fuzzy characteristic function can be constructed via the optimal game values of the zero-sum games with fuzzy payoff...
Sparse binary zero-sum games
Solving zero-sum matrix games takes polynomial time, because the problem reduces to linear programming. Approximate solutions can be computed in sublinear time by randomized algorithms on machines with random access memory. Algorithms working separately and independently on columns and rows have been proposed, with the same performance; these versions are compatible with matrix games with stochastic reward. (Flory and Teytaud, ...
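As a rough illustration of approximately solving a zero-sum matrix game without linear programming, the sketch below uses a deterministic multiplicative-weights (Hedge) update for the row player against a best-responding column player. It is not the randomized sublinear algorithm the abstract refers to; the function name `approx_matrix_game`, the step-size choice, and the rock-paper-scissors example are assumptions made for this example only.

```python
import math


def approx_matrix_game(payoff, iterations=2000):
    """Approximate a zero-sum matrix game (hypothetical helper).

    payoff[i][j] is what the row player (maximizer) receives when the
    row player picks i and the column player (minimizer) picks j.
    Returns (row_strategy, col_strategy, lower_bound, upper_bound),
    where the game value lies between the two bounds.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    flat = [v for row in payoff for v in row]
    span = (max(flat) - min(flat)) or 1.0
    # Standard fixed Hedge step size, rescaled by the payoff range.
    eta = math.sqrt(8.0 * math.log(n_rows) / iterations) / span

    gains = [0.0] * n_rows      # cumulative payoff of each pure row
    row_sum = [0.0] * n_rows    # running sum of the row player's mixes
    col_count = [0] * n_cols    # how often each column was the best response

    for _ in range(iterations):
        # Row player's mixed strategy: exponential weights on past gains
        # (the maximum is subtracted for numerical stability).
        top = max(gains)
        weights = [math.exp(eta * (g - top)) for g in gains]
        total = sum(weights)
        x = [w / total for w in weights]

        # Column player best-responds to the current row mix.
        col_payoffs = [
            sum(x[i] * payoff[i][j] for i in range(n_rows))
            for j in range(n_cols)
        ]
        j = min(range(n_cols), key=col_payoffs.__getitem__)

        for i in range(n_rows):
            gains[i] += payoff[i][j]
            row_sum[i] += x[i]
        col_count[j] += 1

    x_avg = [s / iterations for s in row_sum]
    y_avg = [c / iterations for c in col_count]
    # What the averaged row strategy guarantees against any column.
    lower = min(
        sum(x_avg[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    )
    # What the averaged column strategy concedes against any row.
    upper = max(
        sum(payoff[i][j] * y_avg[j] for j in range(n_cols))
        for i in range(n_rows)
    )
    return x_avg, y_avg, lower, upper


# Rock-paper-scissors: the exact game value is 0.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
x, y, lower, upper = approx_matrix_game(rps)
print(lower, upper)
```

The gap between the two returned bounds shrinks roughly like the inverse square root of the iteration count, so this trades exactness for simplicity. Each iteration here still reads the whole matrix; the sublinear methods mentioned in the abstract avoid that by sampling entries.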
Zero-sum games with charges
We consider two-player zero-sum games with countably infinite action spaces and bounded payoff functions. The players’ strategies are finitely additive probability measures, called charges. Since a strategy profile does not always induce a unique expected payoff, we distinguish two extreme attitudes of players. A player is viewed as pessimistic if he always evaluates the range of possible expec...
Journal
Journal title: Automatica
Year: 2013
ISSN: 0005-1098
DOI: 10.1016/j.automatica.2013.01.062